A Discriminatively Learned CNN Embedding for Person Re-identification
We revisit two popular convolutional neural networks (CNN) in person
re-identification (re-ID), i.e., verification and classification models. The two
models have their respective advantages and limitations due to different loss
functions. In this paper, we shed light on how to combine the two models to
learn more discriminative pedestrian descriptors. Specifically, we propose a
new Siamese network that simultaneously computes identification loss and
verification loss. Given a pair of training images, the network predicts the
identities of the two images and whether they belong to the same identity. Our
network learns a discriminative embedding and a similarity measurement at the
same time, thus making full use of the annotations. Albeit simple, the
learned embedding improves the state-of-the-art performance on two public
person re-ID benchmarks. Further, we show our architecture can also be applied
to image retrieval.
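The joint objective described above can be sketched in a few lines (a minimal NumPy illustration of combining an identification loss on each image with a pairwise verification loss; the function names and the two-class verification weight `verif_w` are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, label):
    # negative log-likelihood of the true class
    return -np.log(probs[label] + 1e-12)

def combined_loss(emb_a, emb_b, logits_a, logits_b, id_a, id_b, verif_w, same):
    # identification loss: classify each image into one of the training identities
    id_loss = (cross_entropy(softmax(logits_a), id_a)
               + cross_entropy(softmax(logits_b), id_b))
    # verification loss: a 2-way (same / different) classifier on the
    # squared difference of the two embeddings (verif_w is a stand-in)
    diff = (emb_a - emb_b) ** 2
    verif_loss = cross_entropy(softmax(diff @ verif_w), int(same))
    return id_loss + verif_loss
```

In practice both terms are computed by heads attached to a shared CNN backbone; the sketch only shows how the two losses are summed for one training pair.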
Parameter-Efficient Person Re-identification in the 3D Space
People live in a 3D world. However, existing works on person
re-identification (re-id) mostly consider semantic representation learning
in a 2D space, intrinsically limiting the understanding of people. In this
work, we address this limitation by exploring the prior knowledge of the 3D
body structure. Specifically, we project 2D images to a 3D space and introduce
a novel parameter-efficient Omni-scale Graph Network (OG-Net) to learn the
pedestrian representation directly from 3D point clouds. OG-Net effectively
exploits the local information provided by sparse 3D points and takes advantage
of the structure and appearance information in a coherent manner. With the help
of 3D geometry information, we can learn a new type of deep re-id feature
that is free from nuisance variations such as scale and viewpoint. To our
knowledge, this is among the first attempts to conduct person
re-identification in 3D space.
We demonstrate through extensive experiments that the proposed method (1) eases
the matching difficulty in the traditional 2D space, (2) exploits the
complementary information of 2D appearance and 3D structure, (3) achieves
competitive results with limited parameters on four large-scale person re-id
datasets, and (4) has good scalability to unseen datasets.
Comment: The code is available at https://github.com/layumi/person-reid-3
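The idea of aggregating appearance features over a graph built from sparse 3D points can be sketched as follows (a simplified NumPy illustration; `knn_graph` and `graph_layer` are hypothetical stand-ins for OG-Net's actual omni-scale blocks):

```python
import numpy as np

def knn_graph(points, k):
    # pairwise squared distances between 3D points
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)           # a point is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]   # indices of the k nearest neighbours

def graph_layer(points, feats, weight, k=4):
    # one aggregation step: each point mixes its own appearance feature
    # with the mean of its k nearest 3D neighbours, then applies a shared
    # linear map followed by ReLU (structure + appearance in one pass)
    nbrs = knn_graph(points, k)
    agg = (feats + feats[nbrs].mean(axis=1)) / 2.0
    return np.maximum(agg @ weight, 0.0)
```

Stacking several such layers with growing receptive fields is one way to read the "omni-scale" design; the real network is parameterized differently.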
Adaptive Boosting for Domain Adaptation: Towards Robust Predictions in Scene Segmentation
Domain adaptation aims to transfer the shared knowledge learned from the source
domain to a new environment, i.e., the target domain. One common practice is to
train the model on both labeled source-domain data and unlabeled target-domain
data. Yet the learned models are usually biased due to the strong supervision
of the source domain. Most researchers adopt the early-stopping strategy to
prevent over-fitting, but when to stop training remains challenging due to
the lack of a target-domain validation set. In this paper, we propose
an efficient bootstrapping method, called AdaBoost Student, that explicitly
learns complementary models during training and liberates users from
empirical early stopping. AdaBoost Student combines deep model learning
with the conventional training strategy, i.e., adaptive boosting, and enables
interactions between learned models and the data sampler. We adopt one adaptive
data sampler to progressively facilitate learning on hard samples and aggregate
"weak" models to prevent over-fitting. Extensive experiments show that (1)
without needing to choose a stopping time, AdaBoost Student provides a
robust solution through efficient complementary model learning during training,
and (2) AdaBoost Student is orthogonal to most domain adaptation methods and
can be combined with existing approaches to further improve state-of-the-art
performance. We have achieved competitive results on three widely-used scene
segmentation domain adaptation benchmarks.
Comment: 10 pages, 7 tables, 5 figures
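The interaction between the adaptive sampler and the "weak" models can be sketched as follows (a NumPy sketch of AdaBoost-style re-weighting and prediction averaging; the exact update rule and aggregation in the paper may differ):

```python
import numpy as np

def update_sample_weights(weights, losses, lr=1.0):
    # samples with above-average loss gain weight exponentially, so the
    # sampler draws hard samples more often in the next round
    w = weights * np.exp(lr * (losses - losses.mean()))
    return w / w.sum()

def sample_batch(weights, batch_size, rng):
    # draw a batch with probability proportional to the current weights
    return rng.choice(len(weights), size=batch_size, replace=True, p=weights)

def aggregate(predictions):
    # average the "weak" models' class probabilities into one prediction,
    # which is what prevents over-fitting to any single training snapshot
    return np.mean(predictions, axis=0)
```

Each training round thus produces one "weak" model, updates the sampler, and the final segmenter is the aggregate rather than any single early-stopped checkpoint.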
Unsupervised Scene Adaptation with Memory Regularization in vivo
We consider the unsupervised scene adaptation problem of learning from both
labeled source data and unlabeled target data. Existing methods focus on
minimizing the inter-domain gap between the source and target domains. However,
the intra-domain knowledge and inherent uncertainty learned by the network are
under-explored. In this paper, we propose an orthogonal method, called memory
regularization in vivo to exploit the intra-domain knowledge and regularize the
model training. Specifically, we refer to the segmentation model itself as the
memory module, and minimize the discrepancy of the two classifiers, i.e., the
primary classifier and the auxiliary classifier, to reduce the prediction
inconsistency. Without extra parameters, the proposed method is complementary
to most existing domain adaptation methods and can generally improve the
performance of existing methods. Albeit simple, we verify the effectiveness of
memory regularization on two synthetic-to-real benchmarks: GTA5 -> Cityscapes
and SYNTHIA -> Cityscapes, yielding +11.1% and +11.3% mIoU improvement over the
baseline model, respectively. Besides, a similar +12.0% mIoU improvement is
observed on the cross-city benchmark: Cityscapes -> Oxford RobotCar.
Comment: 7 pages, 4 figures, 6 tables
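The discrepancy term between the primary and auxiliary classifiers can be sketched as a symmetric KL consistency penalty (a minimal NumPy illustration; the paper's exact discrepancy measure may differ from this assumption):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) per sample, clipped for numerical safety
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return (p * np.log(p / q)).sum(axis=-1)

def memory_regularization(primary_probs, auxiliary_probs):
    # symmetric consistency penalty between the two classifier heads:
    # zero when they agree, positive when their predictions diverge
    fwd = kl_divergence(primary_probs, auxiliary_probs)
    bwd = kl_divergence(auxiliary_probs, primary_probs)
    return 0.5 * (fwd + bwd).mean()
```

Because both heads live inside the one segmentation model, the penalty adds no extra parameters, matching the "in vivo" framing above.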
Improving Person Re-identification by Attribute and Identity Learning
Person re-identification (re-ID) and attribute recognition share a common
goal of learning pedestrian descriptions; they differ in granularity. Most
existing re-ID methods take only identity labels of
pedestrians into consideration. However, we find that attributes, which
contain detailed local descriptions, help the re-ID model learn more
discriminative feature representations. In this paper, based on the
complementarity of attribute labels and ID labels, we propose an
attribute-person recognition (APR) network, a multi-task network which learns a
re-ID embedding and at the same time predicts pedestrian attributes. We
manually annotate attribute labels for two large-scale re-ID datasets, and
systematically investigate how person re-ID and attribute recognition benefit
from each other. In addition, we re-weight the attribute predictions
considering the dependencies and correlations among the attributes. The
experimental results on two large-scale re-ID benchmarks demonstrate that by
learning a more discriminative representation, APR achieves competitive re-ID
performance compared with the state-of-the-art methods. We use APR to speed up
the retrieval process by ten times with a minor accuracy drop of 2.92% on
Market-1501. Besides, we also apply APR on the attribute recognition task and
demonstrate improvement over the baselines.
Comment: Accepted to Pattern Recognition (PR)
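The multi-task objective of the APR network can be sketched as an identity cross-entropy plus weighted per-attribute binary cross-entropies (a NumPy sketch; `attr_weights` is a hypothetical stand-in for the paper's dependency-aware attribute re-weighting):

```python
import numpy as np

def apr_loss(id_probs, id_label, attr_probs, attr_labels, attr_weights):
    # identity term: cross-entropy over the training identities
    eps = 1e-12
    id_loss = -np.log(id_probs[id_label] + eps)
    # attribute term: binary cross-entropy per attribute, re-weighted
    # (attr_weights is illustrative, not the paper's learned weighting)
    bce = -(attr_labels * np.log(attr_probs + eps)
            + (1.0 - attr_labels) * np.log(1.0 - attr_probs + eps))
    return id_loss + (attr_weights * bce).sum()
```

At test time the predicted attributes can also pre-filter the gallery before embedding-distance ranking, which is one way to read the tenfold retrieval speed-up reported above.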